Spanish Automatic Text Enrichment

Authors

  • Mariano Felice
  • Gabriel H. Tolosa
Abstract

Unlike text on paper, hypertext enables the linking of pieces of text with other texts and multimedia resources, which not only improves the way we read but also lays the foundation for new information systems. Specifically, the proliferation of collaborative sites, blogs, online databases, encyclopedias and many other services on the World Wide Web provides an invaluable source of up-to-date information which can be used to aid reading comprehension. As a result, an approach to the automatic extraction, merging and integration of online information is proposed for the purpose of “enriching” texts. This unprecedented text enrichment process allows users to transform ordinary plain texts into self-explanatory hypertexts containing contextual information and resources selected automatically from the Web. Application of such an enrichment process could help students in their scholarly reading, provide users with related multimedia resources and avoid multiple searches for concepts and entities mentioned in a text, among other purposes.

Similar resources

Automatic Syntactic Analysis for Detection of Word Combinations

The paper presents a method for automatic detection of “non-trivial” word combinations in the text. It is based on automatic syntactic analysis. The method shows better precision and recall than the baseline method (bigrams). It was tested on a text in Spanish. The method can be used for enrichment of very large dictionaries of word combinations.

Automatic Recovery of Punctuation Marks and Capitalization Information for Iberian Languages

This paper shows experimental results concerning automatic enrichment of the speech recognition output with punctuation marks and capitalization information. The two tasks are treated as two classification problems, using a maximum entropy modeling approach. The approach is language independent as reinforced by experiments performed on Portuguese and Spanish Broadcast News corpora. The discrimi...

A Hybrid System for Spanish Text Simplification

This paper addresses the problem of automatic text simplification. Automatic text simplification aims at reducing the reading difficulty for people with cognitive disability, among other target groups. We describe an automatic text simplification system for Spanish which combines a rule based core module with a statistical support module that controls the application of rules in the wrong cont...

Monolingual and bilingual dictionary approaches to the enrichment of the Spanish WordNet with adjectives

We report on two different approaches to the incorporation of adjectives in Spanish WordNet based on automatic extraction techniques using EuroWordNet and machine-readable dictionaries. We show that a monolingual dictionary approach makes it possible to exploit relations between different parts of speech and enrich the internal structure of the Spanish WordNet, while the methods based on bilingual dictio...

Automatic Simplification of Spanish Text for e-Accessibility

In this paper we present an automatic text simplification system for Spanish which intends to make texts more accessible for users with cognitive disabilities. This system aims at reducing the structural complexity of Spanish sentences by converting complex sentences into two or more simple sentences, thereby reducing reading difficulty.

Publication date: 2009